-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: teach replicateQueue to handle non-voter addition/removal #59403
kvserver: teach replicateQueue to handle non-voter addition/removal #59403
Conversation
c1188c8
to
a5b5360
Compare
I'm still fixing a few broken tests, but I thought I'd open it up for review while I do that. |
a5b5360
to
71c6154
Compare
2706492
to
4868cb0
Compare
b73c4a2
to
5634c53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave the last commit a pass. You mentioned that this is still a WIP and is based on #59389. Is this blocked on that PR?
Reviewed 1 of 7 files at r7, 10 of 10 files at r8.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)
pkg/kv/kvserver/allocator.go, line 544 at r8 (raw file):
} type chooseReplicaToAddFn func(
I don't find that the indirection through chooseReplicaToAddFn
or chooseReplicaToRemoveFn
provides much clarity, because the corresponding functions (e.g. allocateVoterCandidates
vs. allocateNonVotersFromList
and even allocateVoterCandidates
vs. allocateNonVoterCandidates
) are almost identical. So this is leading to a lot of code duplication and it's also making it quite difficult to understand where the logic isn't identical. As a reader trying to understand this, I'd like to more easily be able to point to one or two places where this distinction matters (e.g. using constraint.EmptyAnalyzedConstraints
for voter constraints when allocating a non-voting replica) and know that it doesn't in all other places.
This makes me wonder if we're splitting code paths too early. Did you explore how this code would look with a voterTarget bool
parameter plumbed through a few of these layers and resorted to targetted branching on this condition?
pkg/kv/kvserver/allocator.go, line 763 at r8 (raw file):
// set. It first attempts to randomly select a target from the set of stores // that have greater than the average number of replicas. Failing that, it // falls back to selecting a random target from any of the existing
"existing voting replicas"
pkg/kv/kvserver/allocator.go, line 779 at r8 (raw file):
// provided set. It first attempts to randomly select a target from the set of // stores that have greater than the average number of replicas. Failing that, // it falls back to selecting a random target from any of the existing replicas.
"existing non-voting replicas"
pkg/kv/kvserver/allocator_scorer.go, line 1003 at r8 (raw file):
valid = true matchingStores := analyzed.SatisfiedBy[i] // TODO DURING REVIEW: Not directly related to this patch, but shouldn't
Reading the comment on the function, I think the idea is that the target store is necessary if "it matches a constraint that is not already fully satisfied by existing replicas". If len(matchingStores) == int(constraints.NumReplicas)
then the constraint is already satisfied without store
, assuming store
itself is not accounted for in analyzed.SatisfiedBy
. I don't think it is, given the comment on removeConstraintsCheck
, which says "The difference between this and allocateConstraintsCheck is that this is to be used on an existing replica of the range, not a potential addition.".
Mind leaving a comment?
pkg/kv/kvserver/replicate_queue.go, line 348 at r7 (raw file):
/* removeIdx */
Looks like we accidentally lost this.
pkg/kv/kvserver/replicate_queue.go, line 429 at r7 (raw file):
existingReplicas []roachpb.ReplicaDescriptor, liveVoterReplicas []roachpb.ReplicaDescriptor, removeIdx int, // -1 for no removal
Looks like we lost this comment.
pkg/kv/kvserver/replicate_queue.go, line 123 at r8 (raw file):
// ReplicateQueueMetrics is the set of metrics for the replicate queue. // TODO(aayush): Track metrics for non-voting replicas separately here.
👍 are you planning to do that in this PR, or should we open an issue to track?
pkg/kv/kvserver/replicate_queue.go, line 550 at r8 (raw file):
} func (rq *replicateQueue) addNonVoter(
Let's give this a comment.
pkg/kv/kvserver/replicate_queue.go, line 778 at r8 (raw file):
} if err := rq.changeReplicas(ctx, repl,
nit: let's keep the wrapping the same as in all other callers, so it's easier to see where this differs from them.
I was intending for it to not be a WIP anymore. It is blocked on #59389 though. Your comment regarding the code duplication captures the gist of what I most wanted your opinion on. Thanks for taking a look. I'll make that change. |
3b4f9c1
to
b4359a9
Compare
c3b6176
to
bce8317
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvserver/allocator_test.go, line 2639 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: was this intentional?
Fixed.
Build failed (retrying...): |
Build succeeded: |
The commit is very similar to cockroachdb#59403. Previously, non-voters on dead or decommissioning nodes would not get removed or replaced. This commit fixes this behavior. Release justification: fixes limitation where dead/decommissioning non-voters would not be removed or replaced Release note: None
The commit is very similar to cockroachdb#59403. Previously, non-voters on dead or decommissioning nodes would not get removed or replaced. This commit fixes this behavior. Release justification: fixes limitation where dead/decommissioning non-voters would not be removed or replaced Release note: None
The commit is very similar to cockroachdb#59403. Previously, non-voters on dead or decommissioning nodes would not get removed or replaced. This commit fixes this behavior. Release justification: fixes limitation where dead/decommissioning non-voters would not be removed or replaced Release note: None
61439: sql: reference sequences used in views by their IDs r=ajwerner a=the-ericwang35 Followup to #59864, fixes #61205. Previously, a change was introduced to store sequences used in DEFAULT expressions to be referenced by their IDs instead of their names, to allow sequence renaming. In this patch, we want to do something similar by rewriting sequence references in views to be referenced by their IDs. This allows sequences used in views to be renamed. Another thing this PR does is change the output of sequences to look like `'s'::REGCLASS` rather than `'s':::STRING::REGCLASS`, since we're no longer type checking them (since it's unnecessary). Release note (sql change): reference sequences used in views by their IDs to allow these sequences to be renamed. 61682: kvserver: teach replicateQueue to replace dead/decommisioning non-voters r=aayushshah15 a=aayushshah15 The PR is very similar to #59403. Previously, non-voters on dead or decommissioning nodes would not get removed or replaced. This commit fixes this behaviour. Resolves #61626 Release justification: fixes limitation where dead/decommissioning non-voters would not be removed or replaced Release note: None 61967: kvserver: stop spuriously refusing non-voters in replicateQueue.shouldQueue() r=aayushshah15 a=aayushshah15 Before this commit, the `replicateQueue` would refuse to queue up a replica into the queue (in its `shouldQueue` method) for the rebalancing case unless it could verify that a voting replica needed to be rebalanced. This was an unfortunate oversight since it meant that unless there was a voting replica to be rebalanced, non-voters would not get rebalanced by the queue. This commit fixes this bug. Noticed while debugging a flakey test for #61682 Release justification: bug fix Release note: None 62009: build: perform `bazel clean --expunge` on `make clean` r=rickystewart a=rickystewart Release note: None 62186: sql: ensure user has correct privileges when adding/removing regions r=richardjcai a=arulajmani Previously we did not account for privileges on database objects when adding the default locality config on first region add or removing the locality config on last region drop properly. In particular, we weren't adding/removing the locality config on any descriptor that wasn't visible to the user. This is bad because our validation logic expects only and all objects in multi-region databases to have a valid locality config. This means future accesses to such descriptors would fail validation. The root of this problem was the API choice here, `ForEachTableDesc`, which filters out invisible descriptors. This patch instead switches to using `forEachTableInMultiRegionDatabase`. While here, instead of issuing separate requests for every table, I refactored this thing to issue a single batch request instead. Now that we view all the descriptors inside the database, unfiltered, we perform privilege checks on them before proceeding with the add/drop operation. In particular, the semantics are: - admin users are allowed to add/drop regions as they wish. - non admin-users require the CREATE privilege or must have ownership on all the objects inside the database. Closes #61003 Release note (sql change): `ALTER DATABASE .. SET PRIMARY REGION` now requires both CREATE and ZONECONFIG privilege on all objects inside the database when adding the first region to the database. Same for dropping the last region using `ALTER DATABASE ... DROP REGION`. Co-authored-by: Eric Wang <[email protected]> Co-authored-by: Aayush Shah <[email protected]> Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: arulajmani <[email protected]>
The commit is very similar to cockroachdb#59403. Previously, non-voters on dead or decommissioning nodes would not get removed or replaced. This commit fixes this behavior. Release justification: fixes limitation where dead/decommissioning non-voters would not be removed or replaced Release note: None
This commit primarily teaches the allocator to be able to rank
non-voting replica candidates for addition and removal. This allows us
to have the replicateQueue execute upon the allocator's actions to add
or remove non-voting replicas to a range.
Note that this commit does not deal with rebalancing of non-voting
replicas, just simple additions and removals when a range is over or
under-replicated.
Release note: None